Loading Libraries

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Formatting plots

Problem 1

Instacart Dataset Descriptions

  • The Instacart dataset includes 15 variables: order_id, product_id, add_to_cart_order, reordered, user_id, eval_set, order_number, order_dow, order_hour_of_day, days_since_prior_order, product_name, aisle_id, department_id, aisle, department.
  • The dataset has 1384617 rows and 15 columns.
  • The range of days since the prior order is (0, 30).
  • The median of add_to_cart_order (the position at which an item was added to the cart) is 7.
  • The total number of times an item was reordered is 828824.
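
The summary figures above could be computed along the following lines. This is a minimal sketch on a small hypothetical tibble, `toy_orders`, whose values are made up and which only borrows the Instacart column names; it is not the code that produced the numbers reported here.

```r
library(dplyr)

# Hypothetical stand-in for the Instacart data (values are made up).
toy_orders <- tibble(
  order_id = c(1, 1, 1, 2, 2),
  add_to_cart_order = c(1, 2, 3, 1, 2),
  reordered = c(1, 0, 1, 1, 0),
  days_since_prior_order = c(9, 9, 9, 30, 30)
)

# Collect the same kinds of summaries listed in the bullets above.
toy_summary <- list(
  n_rows = nrow(toy_orders),
  n_cols = ncol(toy_orders),
  days_range = range(toy_orders$days_since_prior_order),
  median_cart_position = median(toy_orders$add_to_cart_order),
  total_reordered = sum(toy_orders$reordered)
)
```

Because `reordered` is coded 0/1, summing it counts the rows flagged as reorders.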

Aisles: Count and Most Ordered

## # A tibble: 1 × 1
##   distinct_aisles
##             <int>
## 1             134
  • There are 134 different aisles.
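
A count like the one above, together with a tally of items per aisle, might be obtained as sketched here. The tibble `toy_items` is hypothetical (including the "example aisle" entry), and the approach is an assumption about how such a summary is typically written with dplyr, not the report's own code.

```r
library(dplyr)

# Hypothetical item-level data: one row per ordered item.
toy_items <- tibble(
  aisle = c(
    "baking ingredients", "baking ingredients", "baking ingredients",
    "dog food care", "dog food care",
    "example aisle"
  )
)

# Number of distinct aisles, as a 1 x 1 tibble.
n_aisles <- toy_items %>%
  summarize(distinct_aisles = n_distinct(aisle))

# Items ordered per aisle, most ordered first.
aisle_counts <- toy_items %>%
  count(aisle, sort = TRUE)
```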

Instacart: Table

aisle               product_name                                     n    rank
baking ingredients  Cane Sugar                                       336  3
baking ingredients  Light Brown Sugar                                499  1
baking ingredients  Pure Baking Soda                                 387  2
dog food care       Organix Chicken & Brown Rice Recipe              28   2
dog food care       Small Dog Biscuits                               26   3
dog food care       Snack Sticks Chicken & Rice Recipe Dog Treats    30   1
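
A top-three-per-aisle table of this shape could be built roughly as follows. The data and product names (`product_a` and so on) are hypothetical, and using `min_rank()` for the ranking is an assumption rather than the report's confirmed method.

```r
library(dplyr)

# Hypothetical item-level data: one row per ordered item.
toy_items <- tibble(
  aisle = c(rep("baking ingredients", 10), rep("dog food care", 6)),
  product_name = c(
    rep("product_a", 4), rep("product_b", 3), rep("product_c", 2), "product_d",
    rep("product_e", 3), rep("product_f", 2), "product_g"
  )
)

top_products <- toy_items %>%
  count(aisle, product_name) %>%          # orders per product within each aisle
  group_by(aisle) %>%
  mutate(rank = min_rank(desc(n))) %>%    # rank 1 = most ordered in its aisle
  filter(rank <= 3) %>%
  ungroup()
```

`count()` returns rows ordered by the grouping columns, which would be consistent with the table above being sorted by product name rather than by rank.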

Ice Cream: Table

product_name      0         1         2         3         4         5         6
Coffee Ice Cream  13.77419  14.31579  15.38095  15.31818  15.21739  12.26316  13.83333
Pink Lady Apples  13.44118  11.36000  11.70213  14.25000  11.55172  12.78431  11.93750
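
The report does not say so explicitly, but the table above appears to give the mean order_hour_of_day for each product on each order_dow value (0 to 6). Under that assumption, a wide table of this kind could be produced as in this sketch, which uses made-up values in a hypothetical `toy_orders` tibble.

```r
library(dplyr)
library(tidyr)

# Hypothetical orders for the two products (values are made up).
toy_orders <- tibble(
  product_name = c(
    "Coffee Ice Cream", "Coffee Ice Cream", "Coffee Ice Cream",
    "Pink Lady Apples", "Pink Lady Apples", "Pink Lady Apples"
  ),
  order_dow = c(0, 0, 1, 0, 1, 1),
  order_hour_of_day = c(12, 16, 15, 10, 11, 13)
)

mean_hour_wide <- toy_orders %>%
  group_by(product_name, order_dow) %>%
  summarize(mean_hour = mean(order_hour_of_day), .groups = "drop") %>%
  # One row per product, one column per day of the week.
  pivot_wider(names_from = order_dow, values_from = mean_hour)
```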

Problem 2

Problem 3

## Rows: 35 Columns: 1443
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr    (1): day
## dbl (1442): week, day_id, activity.1, activity.2, activity.3, activity.4, ac...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Accelerometer Dataset Description

  • The Accelerometer dataset includes variables week_day, week, day_id, day, and activity.
  • The dataset has 35 rows and 1443 columns.
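  • The range of activity on day 1 is (, -); both endpoints are blank in the rendered output, so the range cannot be read from this report.
  • The total amount of activity on day 1 is 480542.62, taking day 1 to mean day_id 1 in the aggregated table below.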
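
The per-day figures above depend on how the wide activity.* columns are reshaped and filtered. The sketch below shows one way to do this on a hypothetical three-minute tibble, `toy_accel`, with made-up values; the names `minute`, `accel_long`, and `day1` are illustrative and do not come from the report.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per day, one column per minute.
toy_accel <- tibble(
  week = c(1, 1),
  day_id = c(1, 2),
  day = c("Friday", "Monday"),
  activity.1 = c(10, 1),
  activity.2 = c(20, 1),
  activity.3 = c(30, 1)
)

# Reshape to one row per day and minute.
accel_long <- toy_accel %>%
  pivot_longer(
    cols = starts_with("activity."),
    names_to = "minute",
    names_prefix = "activity\\.",
    values_to = "activity"
  ) %>%
  mutate(minute = as.integer(minute))

# Filter on the numeric day_id; the day column holds character day names.
day1 <- accel_long %>% filter(day_id == 1)
day1_range <- range(day1$activity)
day1_total <- sum(day1$activity)
```

A filter that matches no rows is worth checking for, because `sum()` of an empty vector returns 0 rather than raising an error.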

Aggregating Accelerometer Dataset

day_id  sum_activity
1       480542.62
2       78828.07
3       376254.00
4       631105.00
5       355923.64
6       307094.24
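
A daily total like sum_activity can be computed from long-format data with a grouped sum. This is a minimal sketch on a hypothetical tibble, `toy_long`, with made-up values, assuming one row per day and minute.

```r
library(dplyr)

# Hypothetical long-format data: one row per day and minute.
toy_long <- tibble(
  day_id = c(1, 1, 2, 2, 2),
  activity = c(10, 20, 1, 2, 3)
)

# Total activity per day.
daily_totals <- toy_long %>%
  group_by(day_id) %>%
  summarize(sum_activity = sum(activity), .groups = "drop")
```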